Abstract

We study the L1 minimization problem with additional box constraints. We motivate the problem with two different views of optimality considerations. We look into imposing such constraints in projected gradient techniques and propose a worst case linear time algorithm to perform such projections. We demonstrate the merits and effectiveness of our algorithms on synthetic as well as real experiments.

1 Introduction 

In the domain of constrained optimization, it is well understood that the L2 norm penalty imposes a smoothness constraint while the L1 norm imposes sparsity [12]. Lately, sparse representations have been shown to be extremely efficient in encoding specific kinds of data, mainly those obeying a power decay law in some transform space, e.g. DCT. Donoho [6] provided sufficient conditions for obtaining an optimal L1-norm solution which is sparse. Recent work on compressed sensing [16] further explores how L1 constraints can be used for recovering a sparse signal sampled below the Nyquist rate.

L1 regularized maximum likelihood can be cast as a constrained optimization problem. Although standard algorithms such as interior-point methods [15, 9] offer powerful theoretical guarantees (e.g., polynomial-time complexity, ignoring the cost of evaluating the function), these methods typically require at each iteration the solution of a large, highly ill-conditioned linear system, which is potentially very difficult and expensive.

This paper explores the behavior of L1 constrained optimization in the presence of bound constraints. The traditional L1 constraint assumes an infinite upper bound on the magnitude of the predicates. The idea of upper bounds can be explained by a simple example. Suppose we want to send a signal of length $n$ which is a combined signal originating from $k$ different sources. Suppose at the receiver side we have $k$ sets of receivers to decode the entire signal of length $n$. Depending on the receiver set $i \in \{1, \ldots, k\}$, the peak signal strength which the receivers can handle can be different. This kind of problem is not handled by the traditional L1 projection. An illustration of the proposed problem with 2 sets of receivers is shown in Fig. 1. Assuming that transmitting 0's or upper bounds costs just 1 bit, the effective sparsity, assuming 8 bits per element, is (5*8 + 3*1)/64 for L1, and (2*8 + 6*1)/64 for upper bounded L1. We motivate the inclusion of box constraints by looking at the problem from two different settings.

Figure 1: Left: L1 projection and right: upper bounded L1 projection for the same norm bound. The two colors represent different sets of receivers with different upper bounds, represented by black lines.
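As a quick check of the effective-sparsity figures in the receiver example, the two expressions can be evaluated directly. The reading of the counts (8 elements of 8 bits each, hence 64 bits in total, with 5 versus 2 elements needing a full 8-bit value) is taken from the expressions themselves and is otherwise an assumption:

$$\frac{5 \cdot 8 + 3 \cdot 1}{64} = \frac{43}{64} \approx 0.67, \qquad \frac{2 \cdot 8 + 6 \cdot 1}{64} = \frac{22}{64} \approx 0.34 .$$

Under this reading, the upper bounded L1 projection needs roughly half the bits of the plain L1 projection for the same norm bound.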
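Optimality Gap: Let us consider an unconstrained problem

$$\min_{x} \; \|x - v\|_2^2 + \sum_i \lambda_i |x_i| + \zeta^T (l - x) + \gamma^T (x - b),$$

where we have introduced the two bound constraints $[l, b]$ into the cost function. This can be slightly modified to a function of 2 variables such that

$$\min_{x, z} \; z^T z + \sum_i \lambda_i |x_i| + \zeta^T (l - x) + \gamma^T (x - b), \quad \text{s.t. } z = x - v.$$

The Lagrangian for this problem can now be written as

$$L(x, z, \beta) = z^T z + \sum_i \lambda_i |x_i| + \zeta^T (l - x) + \gamma^T (x - b) + \beta^T (x - v - z).$$

The dual function is given by

$$\inf_{x, z} L(x, z, \beta) = \inf_{z} \left( z^T z - \beta^T z \right) + \inf_{x} \Big( \sum_i \lambda_i |x_i| + (\beta - \zeta + \gamma)^T x \Big) - \beta^T v + \zeta^T l - \gamma^T b$$

$$= \begin{cases} -\dfrac{\beta^T \beta}{4} - \beta^T v + \zeta^T l - \gamma^T b & \text{if } |(\beta - \zeta + \gamma)_i| \le \lambda_i \;\; \forall i, \\ -\infty & \text{otherwise.} \end{cases}$$

The Lagrange dual can now be written as

$$\max_{\beta} \; G(\beta) = -\frac{\beta^T \beta}{4} - \beta^T v + \zeta^T l - \gamma^T b \quad \text{s.t. } |(\beta - \zeta + \gamma)_i| \le \lambda_i .$$

A change of variables $\mu = (\beta - \zeta + \gamma)$ leads to

$$\max_{|\mu_i| \le \lambda_i} \; G(\mu) = -\frac{\mu^T \mu}{4} - \mu^T v - \frac{(2\mu + (\zeta - \gamma))^T (\zeta - \gamma)}{4} - (\zeta - \gamma)^T v + \zeta^T l - \gamma^T b .$$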
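The closed form of the dual function rests on two standard infima, which the derivation uses without spelling out. The following short sketch fills in that step; the shorthand $c = \beta - \zeta + \gamma$ is introduced here only for brevity:

$$\nabla_z \left( z^T z - \beta^T z \right) = 2z - \beta = 0 \;\Rightarrow\; z = \tfrac{1}{2}\beta, \qquad \inf_z \left( z^T z - \beta^T z \right) = \tfrac{1}{4}\beta^T\beta - \tfrac{1}{2}\beta^T\beta = -\tfrac{1}{4}\beta^T\beta ,$$

$$\inf_{x_i} \left( \lambda_i |x_i| + c_i x_i \right) = \begin{cases} 0 & \text{if } |c_i| \le \lambda_i, \\ -\infty & \text{otherwise.} \end{cases}$$

The second infimum is separable across coordinates: when $|c_i| \le \lambda_i$ the term $\lambda_i |x_i| + c_i x_i$ is non-negative and vanishes at $x_i = 0$, while for $|c_i| > \lambda_i$ it can be driven to $-\infty$ by letting $x_i$ grow with the sign opposite to $c_i$.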

Now the duality gap for the bounded problem can be written as

$$\eta = \|x - v\|_2^2 + \sum_i \lambda_i |x_i| + \zeta^T (l - x) + \gamma^T (x - b) - G(\mu)$$

$$= \eta_{L_1} - (\zeta - \gamma)^T (x - v) + \frac{(2\mu + (\zeta - \gamma))^T (\zeta - \gamma)}{4},$$

where $\eta_{L_1}$ is the duality gap for the L1 problem without the bound constraints. For $x_i$ fixed at its upper bound, $\gamma_i > 0$ and $\zeta_i = 0$. Under such a condition

$$\eta = \eta_{L_1} - \gamma^T (v - b) - \frac{(2\mu - \gamma)^T \gamma}{4}.$$

As long as $\gamma^T (v - b) + \frac{(2\mu - \gamma)^T \gamma}{4} > 0$, the duality gap is reduced as compared to simple L1 minimization. Since $\gamma^T (v - x)$ is always non-negative (shown later in Sec. 3), a sufficient condition for reduction in duality gap is $\mu > \gamma / 2$, where $\mu$ is the dual feasible solution for the L1 problem. The optimality of a particular solution is based on the duality gap, and as such any decrement of the gap increases the optimality of the solution obtained. This shows that the optimality gap for the bound constrained L1 problem can be made closer to zero than that of the corresponding unbounded L1 problem.

Degrees of freedom for upper bounded problem: We study the degrees of freedom of the upper bounded L1 projection problem in the framework of Stein's unbiased risk estimation (SURE) [14]. As shown by Zou et al., the number of non-zero coefficients is an unbiased estimate for the degrees of freedom (DF) of the optimization scheme. Consider again the multi-receiver example from the introduction, in which each receiver set can handle a different peak signal strength. Assuming that the lower bound is zero (e.g. for electrical signals), transmitting an element fixed at its upper bound can cost just 1 bit. The effective sparsity can therefore be improved by sending 1 bit for each element fixed at its corresponding bound and an 8 bit real number for each remaining non-zero element, as compared to sending 8 bit reals for all non-zero elements. We conjecture that the degrees of freedom for the bounded L1 problem are bounded above by those for the unbounded L1 problem, and hence that the bounded problem can provide increased sparsity in terms of the bounds.




Figure 2: SURE criteria for L1 (blue) and upper bounded L1 (red).

Let us again assume the simple estimation problem $y = x + \epsilon$. To estimate the SURE criterion, we estimate $x$ and then generate a new set of observations $\hat{y}$. The covariance between the terms $y$ and $\hat{y}$ is a scaled measure of the DF. Fig. 2 shows the covariance estimate for L1 compared to upper bounded L1. The DF of upper bounded L1 is uniformly bounded above by the DF of L1 alone. This also supports the conjecture for bound constrained L1: the predicates which are fixed at their respective upper bounds can be considered fixed and as such do not contribute to the model complexity.
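The covariance view of DF can be illustrated with a small Monte Carlo sketch. This is not the experiment behind Fig. 2: it assumes the penalized (soft-thresholding) form of the L1 estimate rather than the constrained projection, a coordinate-wise clip to a symmetric box as the bounded variant, and hypothetical values for the ground truth, threshold, bound and noise level. All function names are illustrative.

```python
import random

def soft_threshold(y, lam):
    """Penalized-form L1 estimate of a single coefficient."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0

def clipped_soft_threshold(y, lam, bound):
    """Soft-threshold, then clip to the box [-bound, bound]."""
    return max(-bound, min(bound, soft_threshold(y, lam)))

def covariance_df(estimator, x_true, sigma, trials, seed):
    """Estimate DF = sum_i cov(xhat_i, y_i) / sigma^2 by Monte Carlo."""
    rng = random.Random(seed)
    n = len(x_true)
    sum_y, sum_e, sum_ye = [0.0] * n, [0.0] * n, [0.0] * n
    for _ in range(trials):
        for i, xi in enumerate(x_true):
            y = xi + rng.gauss(0.0, sigma)   # noisy observation y = x + eps
            e = estimator(y)                 # coordinate-wise estimate
            sum_y[i] += y
            sum_e[i] += e
            sum_ye[i] += y * e
    df = 0.0
    for i in range(n):
        cov = sum_ye[i] / trials - (sum_y[i] / trials) * (sum_e[i] / trials)
        df += cov / sigma ** 2
    return df

# Hypothetical ground truth and settings, chosen only for illustration.
x_true = [0.0] * 10 + [2.0] * 5 + [-3.0] * 5
lam, bound, sigma = 0.5, 1.0, 1.0
df_l1 = covariance_df(lambda y: soft_threshold(y, lam), x_true, sigma, 5000, seed=0)
df_ub = covariance_df(lambda y: clipped_soft_threshold(y, lam, bound), x_true, sigma, 5000, seed=0)
print(df_l1, df_ub)
```

In this toy setting the clipped estimator is flat wherever the bound is active, so those coordinates contribute nothing to the covariance, which is the mechanism the conjecture appeals to.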

2 Separable Quadratic Problems 

The problem of separable quadratic programming with linear bound constraints was first considered by Megiddo et al. [11]. They proposed a linear time solution to the generic problem by Lagrangian relaxation based on the multidimensional search procedure of [10]. We introduce a novel linear time algorithm for the gradient projection based norm minimization problem. Our starting point is an efficient method for projection onto the probabilistic simplex, with additional upper bound constraints. For an infinite upper bound, this is the same method as proposed by numerous authors, namely Gafni et al., Bertsekas, Crammer et al., and more recently by Shalev-Shwartz et al. [13] and Duchi et al. [7]. The basic intuition is that once the vector to be projected is ordered, the projection can be calculated exactly in linear time. Duchi et al. [7] proved the similarity of L1 projection to the simplex projection, although the problem can be traced back to a special case of the separable quadratic problem tackled by Megiddo et al. [11]. Although the projection step is linear, the sorting/ordering step is still O(n log n). We contend that once the upper bounds are introduced, the ordering does not remain as simple as in the previous works. In this paper we propose a linear time algorithm which orders the difference of the bounds from the gradient vector to compute the projections onto the norm constraint.

3 Upper bounded Simplex Projection 

The most basic projection task we consider can be formally described as the following optimization problem, for $v \in \mathbb{R}^n_+$:

$$\min_{x} \; \frac{1}{2} \|x - v\|_2^2 \quad \text{s.t. } \Big\{ x \in \Omega : \sum_{i=1}^{n} x_i \le z, \; 0 \preceq x \preceq b \Big\} \tag{1}$$

where $0 \in \mathbb{R}^n$ is the vector of all zeros, $b \in \mathbb{R}^n$ is the vector of upper bounds and $(p \preceq q)$ denotes that $p_i \le q_i, \forall i \in \{1, \ldots, n\}$. Note that we enforce $\sum_{j=1}^{n} v_j - z > 0$ because otherwise $x = v$ is the optimal solution. We assume that the constraints are feasible with respect to each other, and that the existence of a solution is guaranteed by the set $\Omega$ being non-empty. Also note that if $z\mathbf{1} \preceq b$ or $v \preceq b$, where $\mathbf{1}$ is the vector of all ones of size $n$, then the problem reduces to the projection onto the simplex problem of [13].

Claim 1. $x_i \le v_i, \forall i \in \{1, 2, \ldots, n\}$.

The norm of $v$ is larger than the constraint $z$, as mentioned earlier. Assume there is an optimal projection $x^*$ with an element $x^*_i > v_i$. There exists another solution $\hat{x}$, such that $\hat{x}_i = v_i$ and all other elements are the same as in $x^*$, which satisfies the bounds and gives a lower value for the cost function, and hence is a better solution than $x^*$, which is a contradiction.

Claim 2. $b_i^{effective} = \min(b_i, v_i), \forall i \in \{1, 2, \ldots, n\}$.

The maximum value that $x_i$ can reach is either the upper bound $b_i$ or $v_i$ (from Claim 1), hence the effective upper bound $b_i^{effective}$ is the minimum of $b_i$ and $v_i$.

From this point onwards, we will assume $b_i = b_i^{effective}$, if not mentioned otherwise. The next lemma is the extension of Lemma 1 of Duchi et al. [7] for bounded projections.

Lemma 1. If $v_i > v_j$ and in the optimal solution $x_i = 0$, then $x_j = 0$, irrespective of the ordering of $b_i, b_j$.

Proof. We need to prove the above lemma for 2 cases. In both, assume for contradiction that $x_i = 0$ and $x_j > 0$.

[1] $b_i \ge b_j$. In this case, if another solution is constructed such that $x_i$ and $x_j$ are switched, keeping all other indices the same, then there is a strict decrease in the objective value, generating a contradiction.

[2] $b_i < b_j$. Again let us assume $x_j > 0$. Let us construct another feasible solution $\hat{x}$, such that $\hat{x}_i = \Delta$, $\hat{x}_j = x_j - \Delta$, where $\Delta = \min(b_i, x_j)$, and keep all the other indices the same. It can be easily observed that the norm as well as the upper bound constraints are satisfied for $\hat{x}$. Now we can show that

$$\text{New obj value} - \text{Old obj value} = (v_i - \hat{x}_i)^2 + (v_j - \hat{x}_j)^2 - (v_i - x_i)^2 - (v_j - x_j)^2$$

$$= (v_i - \Delta)^2 + (v_j - x_j + \Delta)^2 - v_i^2 - (v_j - x_j)^2$$

$$= \Delta^2 - 2 v_i \Delta + \Delta^2 + 2 v_j \Delta - 2 x_j \Delta$$

$$= 2\Delta(\Delta - x_j) + 2(v_j - v_i)\Delta \le 2(v_j - v_i)\Delta < 0,$$

which is a contradiction since we constructed a solution better than the optimal solution. □

4 Euclidean Projection onto the 
box-constrained L1 Ball

We modify the problem studied by Duchi et al. [7] to the more generic scenario containing bounds on the predicted vector. We need to find the projection of a vector $v \in \mathbb{R}^n$ onto a feasible region, defined by

$$\min_{x} \; \frac{1}{2} \|x - v\|_2^2 \quad \text{s.t. } \{ x \in \Omega : \|x\|_1 \le z, \; a \preceq x \preceq b \} \tag{2}$$

Note that the vector $v$ is no longer contained within the positive real space. We assume that $\Omega \ne \emptyset$ guarantees feasibility and existence of a solution. The range in which the $x_j$'s should lie can now be characterized as (a) $[a_j, b_j] < 0$, (b) $0 < [a_j, b_j]$ and (c) $a_j < 0 < b_j$, assuming that $a_j < b_j$ for all cases.



Intervals not containing 0: Further analysis of the cost function in Eq. (2) leads to the observation that conditions (a) and (b) are equivalent under a sign flip. This can be obtained by observing the following identities: (a) distance preservation under sign flip, $\|v - x\|_2^2 = \|(-v) - (-x)\|_2^2$, (b) L1 norm preservation under sign flip, $\|x\|_1 = \|-x\|_1$, and (c) range transformation under sign flip, $x \in [a, b] < 0 \Leftrightarrow -x \in [-b, -a] > 0$. For such constraints, we can transform the bounds such that all the boundaries are positive. Once this is done, a simple change of variables

$$\{v, x, b\} - a \Rightarrow \{\tilde{v}, \tilde{x}, \tilde{b}\}, \qquad z - \|a\|_1 \Rightarrow \tilde{z} \tag{3}$$

leads to the formulation in Eq. (2) with the lower bound terms $a_j$ equal to 0. The equivalent simpler problem is exactly similar to Eq. (1), since the L1 constraint and the simplex constraint are the same for positive variables. Also note that element-wise manipulations can be performed when the bounds do not all belong to one particular case, without altering the form of the equations.
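The sign flip and the change of variables in Eq. (3) can be sketched element-wise as follows. This is a minimal illustration under the assumption that every interval excludes 0 in its interior (cases (a) and (b)); the function names `to_zero_lower_bound` and `from_zero_lower_bound` are hypothetical and do not come from the paper.

```python
def to_zero_lower_bound(v, a, b, z):
    """Map a problem with intervals excluding 0 to one with bounds [0, b_t].

    Returns the transformed (v_t, b_t, z_t) plus the per-element sign and
    shift needed to map a solution back.
    """
    sign, shift, v_t, b_t = [], [], [], []
    for vi, ai, bi in zip(v, a, b):
        if ai >= 0:            # case (b): interval already on the positive side
            s, lo, hi = 1.0, ai, bi
        elif bi <= 0:          # case (a): flip so that [a, b] becomes [-b, -a]
            s, lo, hi = -1.0, -bi, -ai
        else:
            raise ValueError("interval contains 0; handled separately")
        sign.append(s)
        shift.append(lo)
        v_t.append(s * vi - lo)    # subtract the (flipped) lower bound
        b_t.append(hi - lo)
    z_t = z - sum(shift)           # z minus the L1 norm of the lower bounds
    return v_t, b_t, z_t, sign, shift

def from_zero_lower_bound(x_t, sign, shift):
    """Undo the shift and the sign flip on a transformed solution."""
    return [s * (xt + lo) for xt, s, lo in zip(x_t, sign, shift)]

v_t, b_t, z_t, sign, shift = to_zero_lower_bound(
    [-3.0, 2.5], [-4.0, 1.0], [-1.0, 3.0], 5.0)
print(v_t, b_t, z_t)
print(from_zero_lower_bound(b_t, sign, shift))
```

Because each element carries its own sign and shift, elements from cases (a) and (b) can be mixed in one vector, in line with the remark on element-wise manipulations.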

Interval containing 0: The objective function in Eq. (2) has an expression of the form $\|v - x\|_p$ ($p = 2$) and the norm constraint is equivalent to inclusion in an $L_p$ norm ball ($p = 1$). Given a candidate solution $x \in \mathbb{R}^n$, let us define a category of moves which can be used to generate a family of candidate solutions. By setting some subset of vector components to zero, we can generate corresponding points in all orthants, resulting in a family of up to $2^n$ candidate solutions, one in each orthant. We note two properties of the orthant projection move which are essential for our exposition. First, the orthant projection move preserves the $L_p$ norm ball inclusion constraint for all $p$. Second, under orthant projection moves of $x$, $\|v - x\|_p$ is minimized when $x$ and $v$ are in the same orthant. Together, these two properties ensure that there is a preferred orthant (determined a priori) where the optimal solution is guaranteed to lie, as long as none of the other constraints are violated. This observation can be used to generalize Lemma 3 of Duchi et al. [7] to a much wider class of problems. For completeness and ease of understanding we state the following lemma:

Lemma 2. [Lemma 3, Duchi et al. [7]] Let $x$ be an optimal solution of Eq. (2). Then $x_i v_i \ge 0, \forall i$.

Hence, $x_i$ has the same sign as $v_i$. Note that this lemma holds true for the upper bounded problem as well, since replacing any variable with 0 does not violate this constraint, and hence the proof of the lemma applies unchanged to the more generic case mentioned above.

Orthant projection also reveals a possible failure mode for the above generalization. If there are terms other than those of the form $\|v - x\|_p$ and there are constraints other than those of the form of inclusion in an $L_p$ norm ball, then a priori preferred orthant selection may not be possible. Preferred orthant selection allows us to simplify the functional form of $L_p$ norm terms. In particular, for the L1 norm projection problem, once the problem has been transformed to guarantee that the optimal solution lies in the first orthant, the L1 norm constraint becomes equivalent to the simplex constraints, $x_i \ge 0, \forall i$ and $\sum_{i=1}^{n} x_i = z$.

In order to retain this simplification, any generalization involving terms which are not conducive to orthant projection must be explicitly handled. Box constraints are in general not conducive, unless the interval contains the origin. For intervals such as $a_j < 0 < b_j$, we have the following claim:

Claim 3.

$$|x_i| \in \begin{cases} [0, |a_i|] & v_i < 0 \\ [0, |b_i|] & v_i > 0 \end{cases}$$


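Based on the above discussions we can write the generic upper bounded L1 norm projection problem as follows: given any $v \in \mathbb{R}^n$, take the absolute value of the elements to transform it to $\tilde{v} \in \mathbb{R}^n_+$, transform $[a, b] \Rightarrow [0, \tilde{b}]$ based on Claim 3, find $\tilde{x}$ by solving

$$\min_{\tilde{x}} \; \frac{1}{2} \|\tilde{x} - \tilde{v}\|_2^2 \quad \text{s.t. } \{ \tilde{x} \in \Omega : \|\tilde{x}\|_1 \le z, \; 0 \preceq \tilde{x} \preceq \tilde{b} \} \tag{4}$$

and return the final projection as $x = \tilde{x} \odot \mathrm{sign}(v)$, where $\odot$ denotes the element-wise product.

Identifying the simplex projection as well as the L1 projection as the same problem leads us to study the unified problem mentioned above in Eq. (4).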
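The reduction just described (absolute values, bounds chosen by the sign of $v_i$, solve in the first orthant, restore signs) can be sketched as below. This is only an illustration under stated assumptions: every interval satisfies $a_i \le 0 \le b_i$, and the inner solver is a plain bisection on the threshold $\theta$ with $x_i(\theta) = \min(b_i, \max(v_i - \theta, 0))$, not the linear time algorithm of Sec. 4.1. The names `project_nonneg` and `project_l1_box` are hypothetical.

```python
def project_nonneg(v, b, z, iters=100):
    """Solve the first-orthant problem (v >= 0, 0 <= x <= b, sum(x) <= z)
    by bisection on the threshold theta."""
    def x_of(theta):
        return [min(bi, max(vi - theta, 0.0)) for vi, bi in zip(v, b)]

    if sum(x_of(0.0)) <= z:            # norm constraint inactive
        return x_of(0.0)
    lo, hi = 0.0, max(v)               # sum(x_of(lo)) > z >= sum(x_of(hi))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(x_of(mid)) > z:
            lo = mid
        else:
            hi = mid
    return x_of(hi)                    # feasible side of the bracket

def project_l1_box(v, a, b, z):
    """Project v onto {x : ||x||_1 <= z, a <= x <= b}, assuming a_i <= 0 <= b_i."""
    v_abs = [abs(vi) for vi in v]
    # Claim 3: the relevant bound is |a_i| for negative v_i and |b_i| otherwise.
    b_eff = [(-ai if vi < 0 else bi) for vi, ai, bi in zip(v, a, b)]
    x_abs = project_nonneg(v_abs, b_eff, z)
    return [xa if vi >= 0 else -xa for xa, vi in zip(x_abs, v)]

x = project_l1_box([3.0, -2.0, 0.5], [-1.0, -1.5, -1.0], [1.0, 1.0, 1.0], 2.0)
print(x)
```

Bisection is used here purely for brevity; it converges to the same threshold but does not have the worst case guarantees discussed later.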

The Lagrangian for the above optimization problem (Eq. (4)) can be written as

$$L(x, \theta, \zeta, \gamma) = \frac{1}{2} \|x - v\|_2^2 + \theta \Big( \sum_{i=1}^{n} x_i - z \Big) - \zeta \cdot x - \gamma \cdot (b - x) \tag{5}$$

Differentiating with respect to $x_i$ and comparing to zero gives the first order optimality condition,

$$\frac{\partial L}{\partial x_i} = x_i - v_i + \theta - \zeta_i + \gamma_i = 0 \tag{6}$$

The first complementary slackness KKT condition [2] implies that $x_i = 0$ when $v_i + \zeta_i = \theta$. Since $\zeta_i \ge 0$, $x_i = 0$ whenever $v_i \le \theta$. The second complementary slackness KKT condition implies that $0 < x_i < b_i$ means $\zeta_i = 0$, $\gamma_i = 0$ and

$$x_i - v_i + \theta = 0 \tag{7}$$

The addition of a finite upper bound leads to the next complementary slackness condition, namely, when the value of $x_i$ reaches its maximum value $b_i$, $\gamma_i \ge 0$ and

$$b_i - v_i + \theta + \gamma_i = 0 \tag{8}$$

Claim 4. $x_i = b_i$ implies $v_i \ge b_i$.

Note that the converse is not generally true, that is, $v_i \ge b_i$ does not imply that $x_i = b_i$, since $x_i$ can still be lower than the upper bound.

Corollary 4. $v_i \ge b_i$ and $\gamma_i > 0$ implies $x_i = b_i$.



One important aspect of the cost $\|x - v\|_2^2$ is that the contribution of each $x_i$ to the total cost is dependent on the distance $v_i - x_i$. From this point onwards we will assume that the upper bound term $b_i = \min(v_i, b_i)$, such that $v_i - b_i \ge 0$. Since each $x_i$ is bounded to be at most $b_i$, $v_i - b_i$ can be thought of as the relative weight determining the order in which the $x_i$'s should be changed to meet the norm constraint. This ordering can also be argued from the fact that the magnitude of the gradient with respect to $x_i$ is determined by the quantity $v_i - b_i$.

In Lemma 1 we have shown that even for the upper bounded simplex projection problem (after restriction to the first orthant), such constraint ordering is possible for constraints $x_i \ge 0$, and is determined by $v$. In the next lemma we show that similar constraint ordering is possible for upper bound constraints $x_i \le b_i$, and is determined by $(v - b)$, which is one of our key contributions and forms the basis of the proposed efficient algorithm. Based on the above observations we write a modified version of Lemma 2 from Shalev-Shwartz et al. [13].

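Lemma 3. Let $x$ be an optimal solution of Eq. (4). Let $i$ and $j$ be two indices such that $(v_i - b_i) \le (v_j - b_j)$. If $x_i = b_i$ then $x_j = b_j$ as well.

Proof. From Eq. (8), whenever $x_i = b_i$, then $v_i - b_i = \theta + \gamma_i$ where $\gamma_i \ge 0$. Hence

$$\begin{aligned} v_i - b_i &\ge \theta, && \text{since } \gamma_i \ge 0 \\ v_j - b_j \ge v_i - b_i &\ge \theta, && \text{given} \\ v_j - b_j &\ge \theta, && \text{such that } \gamma_j \ge 0 \\ x_j &= b_j, && \text{from Corollary 4.} \end{aligned}$$

□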



4.1 Worst case strongly linear time algorithm 

We now propose an algorithm with strongly linear time worst case complexity, which is asymptotically the fastest possible. It is based on the dependence between $\theta$ and $z$ along the regularization path. We have already shown that, in the optimal solution, $x_i = 0$ whenever $v_i \le \theta$, $x_i = b_i$ whenever $v_i - b_i \ge \theta$, and $x_i = v_i - \theta$ otherwise. Hence, for any value of $\theta$, the variables $x_i$ can be divided into three disjoint groups:

$$x_i = \begin{cases} 0 & \text{if } v_i \le \theta & \text{(fixed at lower limit)} \\ b_i & \text{if } v_i - b_i \ge \theta & \text{(fixed at upper limit)} \\ v_i - \theta & \text{if } v_i - b_i < \theta < v_i & \text{(constraints inactive)} \end{cases} \tag{9}$$

Let us denote the sets of indices of the $x_i$'s in these groups by $L$, $U$ and $C$ respectively. These sets are functions of $\theta$. Let the optimal $\theta$ be $\theta^*$ and the corresponding sets be $L^*$, $U^*$ and $C^*$. The relation between $z$ and $\theta$ can be expressed as

$$z = \sum_{i \in U} b_i + \sum_{i \in C} (v_i - \theta) = \sum_{i \in U} b_i + \sum_{i \in C} v_i - |C|\,\theta \tag{10}$$

where $|\cdot|$ for a set argument denotes its cardinality. It is evident that $z$ is a monotonically decreasing piece-wise linear function of $\theta$, with $2n$ breakpoints (points of slope discontinuity) at the $v_i$ and $(v_i - b_i)$ values. The pseudo code of our proposed algorithm is given in Algorithm 1. The algorithm operates upon the merged $v$ and $(v - b)$ arrays, maintaining source information.

In the first stage, we find the linear segment corresponding to the given $z$. The uncertainty interval $[\theta_L, \theta_R]$ for $\theta$ is initialized with $[\min(v - b), \max(v)]$ and is subsequently reduced in every iteration by bisection at a pivot selected from the elements of the merged $v$ and $(v - b)$ arrays lying in the current uncertainty interval. For the pivot we use the median, found using a worst case linear time median finding algorithm [1], in order to ensure that the number of iterations remains $O(\log n)$ and that after every iteration the size of the uncertainty interval reduces by a constant fraction. Using Eq. (10) to evaluate $z$ at the pivot is not efficient enough for overall linear time complexity, since the summations involve $O(n)$ terms every time, resulting in $O(n \log n)$ complexity. To rectify this inefficiency, apart from $S_{all} = \sum_{i=1}^{n} v_i$ we maintain two running partial sums across all iterations.

1) $S_L$ = sum of $v_i$ for all elements which are guaranteed to be set to zero in the optimal solution, i.e. $v_i \le \theta_L \Rightarrow i \in L^*$.

2) $S_R$ = sum of $(v_i - b_i)$ for all elements which are guaranteed to be set to their corresponding upper bounds in the optimal solution, i.e. $(v_i - b_i) \ge \theta_R \Rightarrow i \in U^*$.

We also maintain the cardinalities $n_L$ and $n_R$ of these sets. In terms of these partial sums, (10) can be expressed as

$$z_{pivot} = S_{all} - S_L - S_R - (n - n_L - n_R)\,\theta_{pivot} - \sum_{i:\, \theta_{pivot} \le v_i - b_i < \theta_R} (v_i - b_i - \theta_{pivot}) - \sum_{i:\, \theta_L < v_i \le \theta_{pivot}} (v_i - \theta_{pivot}) \tag{11}$$

If $z_{pivot} > z_{target}$, $[\theta_{pivot}, \theta_R]$ becomes the new uncertainty interval, and $S_L$ and $n_L$ are updated as

$$\{S_L, n_L\} \leftarrow \{S_L, n_L\} + \sum_{i:\, \theta_L < v_i \le \theta_{pivot}} \{v_i, 1\} \tag{12}$$

Otherwise, if $z_{pivot} < z_{target}$, $[\theta_L, \theta_{pivot}]$ becomes the new uncertainty interval, and $S_R$ and $n_R$ are updated as

$$\{S_R, n_R\} \leftarrow \{S_R, n_R\} + \sum_{i:\, \theta_{pivot} \le v_i - b_i < \theta_R} \{(v_i - b_i), 1\} \tag{13}$$

Iterations continue until there are no more breakpoints in the uncertainty interval and $L^*$, $U^*$ and $C^*$ have been found. Now, the following modified version of (10) can be used to evaluate $\theta^*$ as

$$\theta^* = \frac{\sum_{i \in U^*} b_i + \sum_{i \in C^*} v_i - z_{target}}{|C^*|} \tag{14}$$


Algorithm 1 Algorithm for worst case strongly linear time projection onto the simplex with finite upper bound.

REQUIRE v in R^n, b in R^n, 0 < z_target < sum_{i=1..n} b_i
  vvb <- merge(v, (v - b))   // maintain source info
  (idx_thetaL, idx_thetaR) <- (0, 2n - 1)
  S_all <- sum(v)
  (n_L, S_L) <- 0
  (n_R, S_R) <- 0
  while idx_thetaR > idx_thetaL + 1 do
    theta_pivot <- pivot_select(vvb, idx_thetaL, idx_thetaR)
    partition(vvb, idx_thetaL, idx_thetaR, theta_pivot)
    idx_thetapivot <- index(theta_pivot)
    Evaluate z_pivot using (11)
    if z_pivot > z_target then
      idx_thetaL <- idx_thetapivot
      Update (S_L, n_L) using (12)
    else
      idx_thetaR <- idx_thetapivot
      Update (S_R, n_R) using (13)
    end if
  end while
  Evaluate theta* using (14)
RETURN x corresponding to theta* (Eq. (9)).
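The breakpoint search behind Algorithm 1 can be sketched in a few lines. This is a reference sketch rather than the strongly linear time procedure: for brevity the median pivot is taken from a sorted candidate list and $z(\theta)$ is re-evaluated directly at each pivot instead of through the running sums $S_L$ and $S_R$, so it runs in O(n log n). It assumes $v \ge 0$, $b \ge 0$ and $z \ge 0$, and the function name `project_simplex_box` is hypothetical.

```python
def project_simplex_box(v, b, z):
    """Project v (v >= 0) onto {x : sum(x) <= z, 0 <= x <= b}."""
    n = len(v)
    b = [min(bi, vi) for vi, bi in zip(v, b)]        # effective bounds (Claim 2)

    def x_of(theta):                                 # Eq. (9)
        return [min(bi, max(vi - theta, 0.0)) for vi, bi in zip(v, b)]

    if sum(x_of(0.0)) <= z:                          # norm constraint inactive
        return x_of(0.0)

    # Invariant: sum(x_of(theta_lo)) > z >= sum(x_of(theta_hi)).
    theta_lo, theta_hi = 0.0, max(v)
    breakpoints = list(v) + [vi - bi for vi, bi in zip(v, b)]
    candidates = [t for t in breakpoints if theta_lo < t < theta_hi]
    while candidates:
        pivot = sorted(candidates)[len(candidates) // 2]   # median breakpoint
        if sum(x_of(pivot)) > z:
            theta_lo = pivot
        else:
            theta_hi = pivot
        candidates = [t for t in candidates if theta_lo < t < theta_hi]

    # No breakpoints remain strictly inside (theta_lo, theta_hi),
    # so the index sets are fixed on this linear segment.
    upper = [i for i in range(n) if v[i] - b[i] >= theta_hi]            # U*
    free = [i for i in range(n)
            if v[i] - b[i] < theta_hi and v[i] > theta_lo]              # C*
    theta = (sum(b[i] for i in upper)
             + sum(v[i] for i in free) - z) / len(free)                 # Eq. (14)
    return x_of(theta)

x = project_simplex_box([3.0, 2.0, 0.5], [1.0, 1.5, 1.0], 2.0)
print(x)
```

Replacing the sort with a worst case linear time selection routine and the direct evaluation with the partial-sum update of Eqs. (11) to (13) is what brings the cost down to linear time in the algorithm described above.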



5 Experiments 

The first set of experiments is performed on synthetic data. We generate labeled data belonging to 2 classes such that the probability of the label being 1/0 is distributed according to the logistic likelihood $p(y_i = 1 \mid x_i, w) = \sigma(w \cdot x_i)$, where $\sigma(a) = 1/(1 + \exp(-a))$. Additionally, we disturb 10% of the data labels by introducing false labels based on random draws. The ground truth parameter vector $w$ is generated from a generalized Gaussian distribution, with rejection, such that the individual elements of the vector are bounded within ±0.5. Moreover, half the entries of $w$ are set to zero to generate a sparse vector. The inference problem is thus an estimation problem with known upper bounds. We minimize the average logistic log loss, and project the gradient vector onto the convex feasible set. The norm constraint is determined as a fixed fraction of the dimension of the vector. The estimation error against the iterations, where the error is denoted as $f(w) - f(w^*)$, for L1 and our method called UB_L1 is shown in Fig. 3 (left). Note that the L1 method estimates are outside the bounds, which manifests itself as a slower rate of convergence, as evident from the plots. At convergence the UB_L1 estimate seems much closer to the ground truth parameter vector than the L1 estimate.

Next we explore the run time performance of our algorithm against a standard quadratic programming (QP) implementation in MATLAB. Fig. 4 shows the results for such an experiment. For projecting dense vectors with 1M non-zero elements our method takes around 0.22 seconds.

Food distribution The next experiment is drawn from a real world scenario and motivates the upper bound constraints. We start by noting that the problem of food distribution maps naturally onto our setting. Suppose the production of one food item (e.g. chicken) in 40 states in the US is provided as the initial vector $v$ (see footnote 1). Assume that $r\%$ of the total production is put up for sale.

We would like to find the sales vector $x$ for the 40 states. The upper bounds can be obtained from the consumption patterns in the previous years. We take production in 2007 as the new vector $v$, and the distribution in 2004 as the upper bound $b$. To remove scale differences the upper bound is normalized such that $\|b\|_2 = \|v\|_2$. The norm constraint is $z = \|v\|_2 \cdot r/100$. The results for such an experiment are shown in Fig. 5. As the value of $z$ decreases, L1 forces more and more mass into the dominant elements. Our method still tries to satisfy the upper bound constraints, which spreads the distribution at the cost of sparsity. As the supply decreases, L1 tries to bias the distribution among the states based on the relative weights of the production itself. Our method, on the other hand, applies the demand based upper bounds, and biases the distribution in favor of the states with maximum disparity between production and supply.

6 Conclusion 

In this paper we extend the idea of L1 constrained gradient projection to the presence of upper bound constraints. We explore simplex projection with upper bounds and bring out the similarities with L1 projection. We derive criteria for a priori determination of the sequence in which the various constraints become active and use such orderings to propose an efficient algorithm. The key insight obtained from our experiments is that L1 tries to increase the dominant elements while putting zeros for all the others. Bound constrained L1 weighs the elements based on their distance from the corresponding bound. This leads to better predictions, specifically in cases which should be weighed based on the disparity between demand and supply. The elements with higher disparity get higher weight in the predicted distribution vector.

1 http://www.agcensus.usda.gov/Publications/2007

Figure 4: Left: comparison of our method against MATLAB QP. Blue: run time (in seconds) for the QP implementation of MATLAB. Red: run time for our linear time method. The horizontal axis runs over the dimension of the input vector v. Right: zoomed in red curve.

Figure 5: Red: Actual sales of chicken in 40 states in 2007. Blue: our method with upper bounds. Green: L1 only. Note the small region in the circle which has been enlarged. L1 completely misses this region whereas our method still provides some value to it.